Correlation and Sampling in Relational Data Mining

Authors

  • David Jensen
  • Jennifer Neville
Abstract

Data mining in relational data poses unique opportunities and challenges. In particular, relational autocorrelation provides an opportunity to increase the predictive power of statistical models, but it can also mislead investigators using traditional sampling approaches to evaluate data mining algorithms. We investigate the problem and provide new sampling approaches that correct the bias associated with traditional sampling.

Similar resources

Mining Frequent Patterns in Uncertain and Relational Data Streams using the Landmark Windows

Today, in many modern applications, we search for frequent and repeating patterns in the analyzed data sets. In this search, we look for patterns that appear frequently in the data set and mark them as frequent patterns, so that users can make decisions based on these discoveries. Most algorithms presented in the context of data stream mining and frequent pattern detection work either on uncertai...

CoDS: A Representative Sampling Method for Relational Databases

Database sampling has become a popular approach to handle large amounts of data in a wide range of application areas such as data mining or approximate query evaluation. Using database samples is a potential solution when using the entire database is not cost-effective, and a balance between the accuracy of the results and the computational cost of the process applied on the large data set is p...

Relational Aggression in Preschool Children

Objectives: This study aimed to investigate relational aggression in preschool children in Shiraz, as it causes harm to both the aggressive child and the other children. Method: In a descriptive cross-sectional survey, 258 children (119 boys, 139 girls) aged 3 to 7 years completed a 10-item questionnaire in the field of relational aggression for preschool children-teache...

Analyzing Correlation between Internationalization Orientation and Social Network

Research on social networks and collaborative strategies, which contribute to the success and development of firms, has been prominent since the mid-1980s. Relationships and communication with overseas trade partners help firms succeed in entering foreign markets and in finding new partners and new markets abroad. Firm internationalization in foreign countries faces some ba...

A Resampling Technique for Relational Data Graphs

Resampling (a.k.a. bootstrapping) is a computationally intensive statistical technique for estimating the sampling distribution of an estimator. Resampling is used in many machine learning algorithms, including ensemble methods, active learning, and feature selection. Resampling techniques generate pseudosamples from an underlying population by sampling with replacement from a single sample data...

Publication date: 2001